Automatically Annotating and Integrating Spatial Datasets
Authors
Abstract
The recent growth of geospatial information on the web has made a wide variety of spatial data easy to access. Integrating these datasets supports a rich set of queries that none of them could answer in isolation. Accurate integration is challenging, however, because spatial data from different sources may differ in projection, accuracy level, and format (e.g., raster or vector). In this paper, we describe an information integration approach that uses geospatial and textual data available on the Internet to automatically annotate satellite imagery and conflate it with vector datasets. We describe two techniques that automatically generate control point pairs from the satellite imagery and the vector data to perform the conflation. The first technique generates the control point pairs by integrating information from different online sources. The second technique exploits information from the vector data to perform localized image processing on the satellite imagery. Using these techniques, we can automatically integrate vector data with satellite imagery or align multiple satellite images of the same area. Our conflation techniques identify the roads in satellite imagery with an average error of 8.61 meters, compared to an original error of 26.19 meters, for the city of El Segundo, and 7.48 meters, compared to 15.27 meters, for the Adams Morgan neighborhood of Washington, DC.
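To make the role of control point pairs concrete, here is a minimal sketch in Python. The abstract does not say which transformation the authors apply once the pairs are found, so the global least-squares affine fit below is a stand-in chosen for simplicity, not the paper's method. The names `fit_affine`, `apply_affine`, and `mean_error`, and the sample coordinates, are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_3x3(a: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are (nearly) collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (m[i][3] - s) / m[i][i]
    return x


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[float, ...]:
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from control point pairs.

    Assumption: one global affine map; the paper may use a different model.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched control point pairs")
    # Accumulate the normal equations (A^T A) p = A^T t, rows of A = [x, y, 1].
    ata = [[0.0] * 3 for _ in range(3)]
    atu = [0.0] * 3
    atv = [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atu[i] += row[i] * u
            atv[i] += row[i] * v
    a, b, c = solve_3x3(ata, atu)
    d, e, f = solve_3x3(ata, atv)
    return (a, b, c, d, e, f)


def apply_affine(params: Sequence[float], pt: Point) -> Point:
    """Map one vector-data point into image coordinates."""
    a, b, c, d, e, f = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


def mean_error(pts_a: Sequence[Point], pts_b: Sequence[Point]) -> float:
    """Average Euclidean distance between matched points."""
    return sum(math.dist(p, q) for p, q in zip(pts_a, pts_b)) / len(pts_a)


if __name__ == "__main__":
    # Hypothetical control point pairs: vector-data intersections and the
    # positions where the same intersections were detected in the imagery.
    vector_pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
    image_pts = [(12.0, -8.0), (111.0, -6.0), (10.0, 93.0), (109.0, 95.0)]
    params = fit_affine(vector_pts, image_pts)
    aligned = [apply_affine(params, p) for p in vector_pts]
    print("error before:", mean_error(vector_pts, image_pts))
    print("error after: ", mean_error(aligned, image_pts))
```

A single global transform cannot correct distortions that vary from place to place, which is one reason a conflation system might prefer a locally varying warp; the sketch only shows how matched pairs drive the alignment and how the before-and-after error can be measured.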
Similar resources
Automatically Annotating Text with Linked Open Data
This paper presents and evaluates two existing word sense disambiguation approaches which are adapted to annotate text with several popular Linked Open Data datasets. One of the algorithms is based on relationships between resources, while the other one takes advantage of resource definitions provided by the datasets. The aim is to test their applicability when annotating text with resources fr...
Automatic Metadata Annotation through Reconstructing Provenance
Annotating datasets with metadata is an important part of organizing and curating data. However, it is a time-consuming process and is often not done rigorously. In this paper, we propose a new approach to annotating datasets through the use of reconstructed provenance. A detailed survey of the related work in this area is given. Additionally, we provide an overview of our approach for ...
Integration of Multi-Source Spatial Datasets Via Web Services Semi-Automatically and Ontology Based
This paper focuses on the integration of multi-source datasets, which often occur in the spatial planning process, land administration and management, resource management, and site selection in Iran. In many cases in Iran, land records are available as textual data (definitions of parcels in title deeds), vector datasets, aerial photos, and satellite images in different forms. The problem is that there...
Annotating Inter-Sentence Temporal Relations in Clinical Notes
Owing in part to the surge of interest in temporal relation extraction, a number of datasets manually annotated with temporal relations between event-event pairs and event-time pairs have been produced recently. However, it is not uncommon to find missing annotations in these manually annotated datasets. Many researchers attributed this problem to "annotator fatigue". While some of these missin...
HandSeg: A Dataset for Hand Segmentation from Depth Images
We introduce a large-scale RGBD hand segmentation dataset, with detailed and automatically generated high-quality ground-truth annotations. Existing real-world datasets are limited in quantity due to the difficulty in manually annotating ground-truth labels. By leveraging a pair of brightly colored gloves and an RGBD camera, we propose an acquisition pipeline that eases the task of annotating ve...